Hierarchical Information Clustering Using Ontology Languages

Authors

  • Travis D. Breaux
  • Joel W. Reed
Abstract

Tools for analyzing and visualizing information from multiple, inhomogeneous sources have traditionally relied on improvements in statistical methods. Statistical methods, however, overlook relevant semantic features present in natural language and text-based information. Emerging research in ontology languages (e.g. RDF, RDFS, SUO-KIF, and OWL) offers promising avenues for overcoming these limitations by leveraging existing and future libraries of metadata and semantic markup. Using semantic features (e.g. hypernyms, meronyms, and synonyms) encoded in ontology languages, methods such as keyword search and clustering can be augmented to analyze and visualize documents at conceptually higher levels. We present findings from a hierarchical clustering system modified for ontological indexing and run on a topic-centric test collection of documents, each with fewer than 200 words. Our findings show that ontologies can impose a complete interpretation or subjective clustering onto a document set that is at least as good as meta-word search.
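The idea of indexing documents by ontology concepts before clustering can be illustrated with a minimal sketch. This is not the authors' system: the `HYPERNYMS` table is a hypothetical toy ontology, and the choice of Jaccard similarity with single-link agglomerative merging is an assumption made only for illustration.

```python
from itertools import combinations

# Hypothetical toy ontology: each term maps to its direct hypernym
# (broader concept). A real system would read this from RDF/OWL data.
HYPERNYMS = {
    "poodle": "dog",
    "beagle": "dog",
    "dog": "animal",
    "cat": "animal",
    "sedan": "car",
    "car": "vehicle",
    "truck": "vehicle",
}


def expand(term):
    """Return the term together with all of its ancestors in the toy ontology."""
    concepts = {term}
    while term in HYPERNYMS:
        term = HYPERNYMS[term]
        concepts.add(term)
    return concepts


def index(text):
    """Index a document by its whitespace-separated terms and their hypernyms."""
    concepts = set()
    for token in text.lower().split():
        concepts |= expand(token)
    return concepts


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, use_ontology=True):
    """Single-link agglomerative clustering.

    Returns the merge history as (left_doc_ids, right_doc_ids, similarity).
    """
    feats = [index(d) if use_ontology else set(d.lower().split()) for d in docs]
    clusters = [frozenset([i]) for i in range(len(docs))]
    merges = []

    def link(pair):
        # Single link: similarity of the closest pair of documents.
        a, b = pair
        return max(jaccard(feats[i], feats[j]) for i in a for j in b)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=link)
        merges.append((sorted(a), sorted(b), link((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


if __name__ == "__main__":
    docs = ["poodle", "beagle", "sedan", "truck"]
    for left, right, sim in cluster(docs):
        print(left, right, sim)
```

In this sketch, the documents "poodle" and "beagle" share no keywords, so plain keyword indexing gives them zero similarity. Once each term is expanded with its hypernyms, they overlap on the concepts "dog" and "animal" and are merged first.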

Similar resources

Mixed-Initiative Clustering

Mixed-initiative clustering is a task where a user and a machine work collaboratively to analyze a large set of documents. We hypothesize that a user and a machine can both learn better clustering models through enriched communication and interactive learning from each other. The first contribution of this thesis is providing a framework of mixed-initiative clustering. The framework consists of ...

Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems

Ontology is the main infrastructure of the Semantic Web, providing facilities for the integration, searching, and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the need for ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...

April 13, 2010 Draft Mixed-Initiative Clustering

Mixed-initiative clustering is a task where a user and a machine work collaboratively to analyze a large set of documents. We hypothesize that a user and a machine can both learn better clustering models through enriched communication and interactive learning from each other. The first contribution of this thesis is providing a framework of mixed-initiative clustering. The framework consists o...

Ontology-Based File Naming Through Hierarchical Conceptual Clustering

Current directory-based hierarchical file systems have many limitations as the amount of unstructured data held by individual users continues to grow. One of the most significant problems is that users often have difficulty searching, navigating, and organizing their files, since useful semantic information describing a file is not used in the current directory-based system. To ...

An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)

Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...

Publication date: 2004